ABSTRACT

The reconstruction of continuous-time signals s(t) from the sign of their
(deliberately) contaminated samples is considered. Sequential, generally
nonlinear estimates of s(t) are established and their performance is studied;
error bounds and convergence rates are derived. The signal s(t) need not be
bandlimited. The convergence rates obtained here are faster than those
obtained in [4] for nonsequential estimates. The degradation in the
reconstruction of the signal, due to transmission over an arbitrary noisy
channel, is also investigated and bounds on the additional error are obtained.

I. INTRODUCTION

This paper is concerned with the problem of reconstructing a continuous-time
signal $s(t)$, $-\infty < t < \infty$, from its sign (clipped version)
$\mathrm{sgn}[s(t)]$, $-\infty < t < \infty$. In general the binary signal
sgn[s(t)] does not uniquely determine s(t) even when s(t) is an analytic
function. By contrast, when s(t) is a bandlimited function and the hard-limiter
sgn x is replaced by a strictly monotonic transformation $\psi(x)$, then s(t)
can be recovered from the bandlimited version of $\psi[s(t)]$ by using a
recursive algorithm based on the principle of contraction mapping; this was
accomplished by Landau [1] for conventionally bandlimited functions and by
Masry and Cambanis [2] for bandlimited (in the sense of Zakai) stochastic
signals. No such results, however, are available when $\psi(x) = \mathrm{sgn}\,x$,
whether s(t) is bandlimited or not (see, for example, [3]).

In [4] we proposed a digital scheme, modelled as a transmitter/receiver as in
Figure 1, whereby the signal s(t), not necessarily bandlimited, is sampled
periodically at a fixed sampling rate W. The samples $\{s(k/W)\}_k$ are
deliberately contaminated by additive noise $\{X_k\}_k$ having an appropriate
distribution F(x) (in practical applications, $\{X_k\}_k$ are
computer-generated random numbers drawn from the distribution F(x)). The sign
of the contaminated samples $\{s(k/W) + X_k\}_k$ is then obtained, i.e.,
$\{Z_{W,k} = \mathrm{sgn}[s(k/W) + X_k]\}_k$. It was then shown in [4] that
estimates $\hat{s}_W(t)$ of s(t), based on the ±1 sequence $\{Z_{W,k}\}_k$,
exist such that $\hat{s}_W(t)$ converges (with probability one) to s(t) as the
sampling rate W tends to infinity. Note that in the absence of $\{X_k\}_k$,
s(t) cannot be reconstructed from $\{\mathrm{sgn}[s(k/W)]\}_k$ as
$W \to \infty$.

This paper continues the investigation begun in [4] and has several
objectives. We first note that the estimates considered in [4] were
nonsequential, i.e., they reconstruct $\{s(t), t \ge 0\}$ from the entire data
set $\{Z_{W,k}\}_{k \ge 0}$. Our first objective is to establish sequential
estimates of s(t) and study their performance. We seek tight bounds on the
moments of the error $\hat{s}_W(t) - s(t)$ for a finite sampling rate W, mean
and probability-one rates of convergence as $W \to \infty$, and a central limit
theorem for the error $\hat{s}_W(t) - s(t)$. These results for the sequential
estimates are sharper than those obtained in [4]; for example, the rate of
convergence of the mean-square error is at most $W^{-1/2}$ for the
nonsequential estimates of [4] and is $W^{-2/3}$ for the sequential estimates
considered here. A second objective of this paper is the investigation of the
effect of channel noise when the binary data $\{Z_{W,k}\}_k$ is transmitted
over a noisy channel (see Fig. 2). This is considered in Section III, where
bounds on the additional error in the estimation of s(t), due to channel
noise, are derived. Distinct results are obtained for the white and colored
channel noise cases.

The approach used in [4] and in this paper, to deliberately contaminate the
signal s(t) before quantization, can be viewed in two ways. One is that of
"dithering", a concept which has been used in the past for correlation
function estimation [5], [6], digital matched filtering [7], and other aspects
of digital signal processing as reviewed in [8] (see also the paper by Root [9]
in the context of communication through unspecified additive noise).
Alternatively, it can be viewed as random quantization for deterministic
continuous-time signals. The idea of random quantization has recently been
advocated by Papantoni-Kazakos [10] for discrete-time stochastic signals and
shown to be essential for stability under perturbations in the statistical
description of the signals.

The organization of the paper is as follows. In Section II we consider the
noise-free channel case and derive the convergence properties of the
sequential estimate $\hat{s}_W(t)$. In Section III we consider the effect of
channel noise on the performance of the estimate $\hat{s}_W(t)$. Both Sections
II and III contain a discussion on the choice of the parameters of the
transmitter/receiver such that the reconstruction of the signal is achieved
with an error not exceeding a prescribed level. Section IV is a collection of
remarks on certain unresolved questions in this area. The Appendix contains
certain auxiliary propositions needed for the derivations in Section II, as
well as a supplement to Section III.

II.  THE  RECONSTRUCTION  OF  THE  SIGNAL  (NOISE-FREE  CHANNEL) 

A.  Preliminaries 

We consider the noise-free channel case, as depicted in Figure 1, and specify
admissible distributions F(x) of $\{X_k\}$, linear systems L, and memoryless
nonlinearities g(x), such that $\hat{s}_W(t)$ is a sequential estimate of s(t);
the convergence properties of $\hat{s}_W(t)$ are then obtained. Throughout this
paper it is assumed that the signal s(t) belongs to the following class of
signals.

Assumption A. Let b be a fixed known positive constant and let s(t), $t \ge 0$,
be any uniformly continuous function satisfying $|s(t)| \le b$ for all
$t \ge 0$.

The constant b is simply a peak constraint on the signal s(t). Aside from the
knowledge of b, the receiver structure and the convergence results of this
paper are nonparametric in the signal. Next it is assumed that the
contaminating random numbers $\{X_k\}$ constitute a sequence of independent
identically distributed random variables with a symmetric distribution F(x).
The following argument provides the rationale for the recovery scheme and
determines the class of admissible distributions F(x). Let X be a random
variable with a symmetric distribution function F(x) and define the moment
function $\mu(s)$ by

$$\mu(s) = E[\mathrm{sgn}(s + X)], \qquad -\infty < s < \infty.$$

Then

$$\mu(s) = P[X > -s] - P[X < -s] = 1 - 2F(-s) = 2F(s) - 1, \qquad -\infty < s < \infty.$$

Thus $\mu(s)$ is strictly monotonic on an interval (-c,c) if and only if F(x)
is strictly monotonic on (-c,c). Any distribution F(x) which is strictly
monotonic on an interval (-c,c) for some $c \ge b$ is an admissible
distribution for the sequence $\{X_k\}$ in the transmitter. Now let

$$m(t) = E[\mathrm{sgn}(s(t) + X)], \qquad t \ge 0 \tag{2}$$

be the mean function of the hard-limiter output with input s(t) + X; then it
is seen that

$$m(t) = \mu[s(t)], \qquad t \ge 0. \tag{3}$$

By the strict monotonicity of $\mu(\cdot)$ over (-c,c), for some $c \ge b$, we
then have $s(t) = \mu^{-1}(m(t))$. Hence, in principle, an estimate
$\hat{s}(t)$ of s(t) can be obtained from an estimate $\hat{m}(t)$ of m(t) via
$\hat{s}(t) = \mu^{-1}(\hat{m}(t))$; $\hat{m}(t)$ can be obtained from the
binary data $\{Z_{W,k}\}$ in a linear recursive manner. This explains the
structure of the recovery scheme in Fig. 1. Some refinement of the above
argument is needed, however, since $\hat{m}(t)$ need not take values in the
interval $[\mu(-b), \mu(b)]$ whereas m(t) does. This will become clear below.

Convergence results can be obtained for any admissible distribution F(x)
specified above; however, in order to provide explicit bounds on the
mean-square error of the estimates, we shall concentrate on three typical
distributions.

When X is uniform over [-b,b] we have

$$\mu_U(s) = \begin{cases} -1, & s < -b \\ s/b, & -b \le s \le b \\ 1, & b < s. \end{cases} \tag{4}$$

When X is normal $N(0,\sigma^2)$ we have

$$\mu_N(s) = 2\Phi(s/\sigma) - 1, \qquad -\infty < s < \infty, \tag{5}$$

where $\Phi(x)$ is the standard normal distribution function. When X is
Laplacian ($f(x) = (\alpha/2)\exp(-\alpha|x|)$) we have

$$\mu_L(s) = (\mathrm{sgn}\, s)\,(1 - e^{-\alpha|s|}), \qquad -\infty < s < \infty. \tag{6}$$

We now specify the memoryless nonlinearity g(x) in the receiver by
$g(x) = \mu^{-1}(x)$ over an interval containing $[\mu(-b), \mu(b)]$, and by
g(x) = 0 elsewhere. For the three chosen distributions F(x), $\mu_N(s)$ and
$\mu_L(s)$ are invertible over the entire real line while $\mu_U(s)$ is
invertible over [-b,b], and we define g(x) as follows.

Assumption B. We say that (B) is satisfied if any one of (B1), (B2), or (B3)
is satisfied.

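(B1): X is uniform over [-b,b] and

$$g_U(x) = \begin{cases} bx, & |x| \le 1 \\ 0, & |x| > 1. \end{cases}$$

(B2): X is normal $N(0,\sigma^2)$ and

$$g_N(x) = \begin{cases} \mu_N^{-1}(x), & |x| \le \mu_N(c) \\ 0, & \text{otherwise,} \end{cases} \qquad c = b + \epsilon,\ \epsilon > 0.$$

(B3): X is Laplacian and

$$g_L(x) = \begin{cases} -\dfrac{1}{\alpha}(\mathrm{sgn}\, x)\ln(1 - |x|), & |x| \le 1 - e^{-\alpha c} \\ 0, & \text{otherwise,} \end{cases} \qquad c = b + \epsilon,\ \epsilon > 0.$$
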
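The moment functions (4)-(6) and the receiver nonlinearities of (B1)-(B3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the inverse of $\mu_N$ is written as $\sigma\,\Phi^{-1}((1+x)/2)$ (which follows from (5)), and the cutoffs are taken as non-strict inequalities.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for Phi and its inverse


def mu_uniform(s, b):
    """Eq. (4): mean of sgn(s + X) for X uniform on [-b, b]."""
    return max(-1.0, min(1.0, s / b))


def mu_normal(s, sigma):
    """Eq. (5): 2*Phi(s/sigma) - 1 for X ~ N(0, sigma^2)."""
    return 2.0 * _STD_NORMAL.cdf(s / sigma) - 1.0


def mu_laplace(s, alpha):
    """Eq. (6): sgn(s) * (1 - exp(-alpha*|s|)) for Laplacian X."""
    return math.copysign(1.0 - math.exp(-alpha * abs(s)), s)


def g_uniform(x, b):
    """(B1): linear inverse on [-1, 1], zero outside."""
    return b * x if abs(x) <= 1.0 else 0.0


def g_normal(x, sigma, c):
    """(B2): inverse of mu_normal for |x| <= mu_normal(c), zero outside."""
    if abs(x) > mu_normal(c, sigma):
        return 0.0
    return sigma * _STD_NORMAL.inv_cdf((1.0 + x) / 2.0)


def g_laplace(x, alpha, c):
    """(B3): inverse of mu_laplace for |x| <= 1 - exp(-alpha*c), zero outside."""
    if abs(x) > 1.0 - math.exp(-alpha * c):
        return 0.0
    return math.copysign(-math.log(1.0 - abs(x)), x) / alpha
```

For any $|s| \le b$, composing each `g_*` with the matching `mu_*` returns s, which is the identity $s(t) = \mu^{-1}(m(t))$ used by the recovery scheme.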

We next specify the linear system L in the receiver whose output
$\hat{m}_W(t)$ provides an estimate of the function m(t) given in (3). We
consider sequential estimates using a sliding window on blocks of a fixed size
N of the data $\{Z_{W,k}\}$. First m(t) is estimated at the sampling points
$\{k/W\}_{k=0}^{\infty}$ by

$$\hat{m}_W\!\left(\frac{k}{W}\right) = \frac{1}{N}\sum_{i=0}^{N-1} Z_{W,i+k}, \qquad k = 0, 1, \ldots, \tag{7}$$

and then m(t) is estimated either by the step function

$$\hat{m}_W(t) = \sum_{k=0}^{\infty} \hat{m}_W\!\left(\frac{k}{W}\right) 1_{[\frac{k}{W},\frac{k+1}{W})}(t) \tag{8a}$$

or by the piecewise linear function

$$\hat{m}_W(t) = \sum_{k=0}^{\infty} \left\{ (Wt-k)\,\hat{m}_W\!\left(\frac{k+1}{W}\right) + [1-(Wt-k)]\,\hat{m}_W\!\left(\frac{k}{W}\right) \right\} 1_{[\frac{k}{W},\frac{k+1}{W})}(t). \tag{8b}$$

The estimate $\hat{s}_W(t)$, $t \ge 0$, is then given by

$$\hat{s}_W(t) = g[\hat{m}_W(t)] \tag{9}$$

where g(x) is specified by Assumption (B) and $\hat{m}_W(t)$ is given by (8a) or
(8b). Thus $\hat{s}_W(t)$ is obtained sequentially with delay not exceeding N/W
and (N+1)/W for (8a) and (8b), respectively. The estimate $\hat{s}_W(t)$
determined by (8b) is continuous in t whereas the estimate determined by (8a)
is a step function in t, and thus the former may be considered a more suitable
estimate for the continuous signal s(t). Note that when X is uniform over
[-b,b], we have

$$\hat{s}_W(t) = b\,\hat{m}_W(t), \qquad t \ge 0, \tag{10}$$

since only the linear portion of $g_U(x)$ is used (as by (8a) and (8b),
$|\hat{m}_W(t)| \le 1$). In this case, therefore, the estimate $\hat{s}_W(t)$
is linear in the data $\{Z_{W,k}\}_{k=0}^{\infty}$.
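As an illustration of the transmitter and the receiver for the uniform case (B1), the following Python sketch dithers the samples, takes their sign, and applies the sliding-window average (7) followed by (10) at the sampling points t = k/W. The function names, the running-sum implementation, and the test signal in the usage note are assumptions made for this example only.

```python
import math
import random


def sliding_mean(z, n):
    """Eq. (7): m_hat[k] = (1/N) * sum_{i=0}^{N-1} z[k+i], via a running sum."""
    acc = sum(z[:n])
    out = [acc / n]
    for k in range(1, len(z) - n + 1):
        acc += z[k + n - 1] - z[k - 1]  # slide the window by one sample
        out.append(acc / n)
    return out


def reconstruct_uniform(signal, b, w, n, duration, rng):
    """Return s_hat(k/W) = b * m_hat(k/W) for k = 0, ..., int(duration*W).

    signal   : callable t -> s(t), assumed to satisfy |s(t)| <= b
    b        : peak constraint, also the half-width of the uniform dither
    w        : sampling rate W
    n        : block size N
    rng      : random.Random instance generating the dither X_k
    """
    total = int(duration * w) + n  # N-1 extra samples fill the last window
    z = [1 if signal(k / w) + rng.uniform(-b, b) > 0 else -1
         for k in range(total)]
    return [b * m for m in sliding_mean(z, n)]
```

For instance, with `signal = lambda t: 0.8 * math.sin(2 * math.pi * t)`, `b = 1.0`, `w = 20000` and `n = 400`, the returned list can be compared sample by sample against `signal(k / w)` to observe the bias/variance behaviour discussed in the next subsection.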

B. Bounds on Mean-Square Error and Discussion

Our first result provides a bound on the mean-square error of the estimate
$\hat{s}_W(t)$. It is stated in terms of the modulus of continuity
$\omega(s;\delta)$ of s(t), which is defined for each $\delta > 0$ by

$$\omega(s;\delta) = \sup_{\{t,t' \ge 0:\ |t-t'| \le \delta\}} |s(t) - s(t')|,$$

and which tends to zero as $\delta \to 0$ by the uniform continuity of s(t)
over $[0,\infty)$. Even though we state all results for signals defined, and
uniformly continuous, over the entire half line $[0,\infty)$, the same results
are of course valid for continuous signals defined over finite intervals [0,T]
with the modulus of continuity over $[0,\infty)$ replaced by that over [0,T].

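Theorem 2.1. Under Assumptions (A), (B), the estimate $\hat{s}_W(t)$ satisfies

$$E[\hat{s}_W(t) - s(t)]^2 \le K_1\,\omega^2\!\left(s;\frac{N}{\sqrt{3}\,W}\right) + K_2\,\frac{1}{N}$$

uniformly in $t \ge 0$. The constants $K_1$ and $K_2$ are determined by (B) as
follows:

For (B1): $K_1 = 4$, $K_2 = b^2$

For (B2): $K_1 = \dfrac{8}{\pi\sigma^2} K_2$, $K_2 = \tfrac{1}{2}\pi\sigma^2\, e^{c^2/\sigma^2}\,(1 + b^2/\epsilon^2)$

For (B3): $K_1 = 4\alpha^2 K_2$, $K_2 = \alpha^{-2} e^{2\alpha c}\,(1 + b^2/\epsilon^2)$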

Theorem 2.1 shows, in particular, that the estimate $\hat{s}_W(t)$ converges
to s(t) in the mean-square sense as $W \to \infty$, uniformly in $t \ge 0$,
provided the block size N is chosen to depend on W, $N = N_W$, such that

$$N_W \to \infty \quad \text{and} \quad \frac{N_W}{W} \to 0 \quad \text{as } W \to \infty. \tag{11}$$

Proof. We first note that the function $m(t) = \mu[s(t)]$ is also uniformly
continuous on $[0,\infty)$ since $\mu(s)$, given in (4)-(6), is differentiable
over [-b,b] with

$$Q = \max_{|s| \le b} \mu'(s) = \begin{cases} 1/b, & \text{under (B1)} \\ \sqrt{2/\pi}\,\sigma^{-1}, & \text{under (B2)} \\ \alpha, & \text{under (B3),} \end{cases} \tag{12}$$

i.e.,

$$\omega(m;\delta) \le Q\,\omega(s;\delta). \tag{13}$$

Next we obtain a bound on the mean-square error of the estimates
$\hat{m}_W(t)$ given in (8a)-(8b). The corresponding bound for $\hat{s}_W(t)$
then follows from Proposition 1 of the Appendix.

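a) For the estimate (8a) and for each fixed $t \ge 0$, we have
$\hat{m}_W(t) = \hat{m}_W(k/W)$ where k is such that $k/W \le t < (k+1)/W$.
Since

$$E[Z_{W,k}] = \mu[s(k/W)] = m(k/W),$$

we have by (7)

$$E[\hat{m}_W(t)] = \frac{1}{N}\sum_{i=0}^{N-1} m\!\left(\frac{i+k}{W}\right),$$

which is a positive linear operator on the function m(u), $u \ge 0$, and by a
well-known result in approximation theory [11, pp. 28-29]

$$|\mathrm{Bias}[\hat{m}_W(t)]| = |E[\hat{m}_W(t)] - m(t)| \le 2\,\omega(m;\alpha_{W,k}(t)),$$

where

$$\alpha^2_{W,k}(t) = \frac{1}{N}\sum_{i=0}^{N-1}\left(\frac{i+k}{W} - t\right)^2 = \frac{1}{N}\sum_{i=1}^{N-1}\left(\frac{i}{W}\right)^2 - \left(t - \frac{k}{W}\right)\left[\frac{N-1}{W} - \left(t - \frac{k}{W}\right)\right].$$

The second term above is $\le 0$ for $N \ge 2$ since $0 \le t - k/W < 1/W$.
Hence

$$\alpha^2_{W,k}(t) \le \frac{(N-1)(2N-1)}{6W^2} \le \frac{N^2}{3W^2}.$$

As the bound on $\alpha^2_{W,k}(t)$ does not depend on k, we have for all
$t \ge 0$

$$|\mathrm{Bias}[\hat{m}_W(t)]| \le 2\,\omega\!\left(m;\frac{N}{\sqrt{3}\,W}\right). \tag{14a}$$

Now since $\{Z_{W,k}\}$ are independent with
$\mathrm{Var}[Z_{W,k}] \le E[Z^2_{W,k}] = 1$, it follows from (7) that

$$\mathrm{Var}[\hat{m}_W(t)] = \frac{1}{N^2}\sum_{i=0}^{N-1}\mathrm{Var}[Z_{W,i+k}] \le \frac{1}{N}. \tag{15a}$$

Thus for all $t \ge 0$ we have by (14a) and (15a)

$$E[\hat{m}_W(t) - m(t)]^2 \le 4\,\omega^2\!\left(m;\frac{N}{\sqrt{3}\,W}\right) + \frac{1}{N},$$

and the corresponding bound for $\hat{s}_W(t)$ follows from (13) and
Proposition 1 of the Appendix.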

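b) For the estimate (8b) and for each fixed $t \ge 0$ with
$k/W \le t < (k+1)/W$, (8b) can be written in the form

$$\hat{m}_W(t) = \sum_{i=k}^{k+N} h_{W,k}(t,i)\, Z_{W,i} \tag{16}$$

where

$$h_{W,k}(t,i) = \begin{cases} \frac{1}{N}[1-(Wt-k)], & i = k \\ \frac{1}{N}, & i = k+1,\ldots,k+N-1 \\ \frac{1}{N}(Wt-k), & i = k+N. \end{cases} \tag{17}$$

Then

$$E[\hat{m}_W(t)] = \sum_{i=k}^{k+N} h_{W,k}(t,i)\, m\!\left(\frac{i}{W}\right),$$

which is a positive linear operator on the function m(u), $u \ge 0$, since
$h_{W,k}(t,i) \ge 0$ and $\sum_{i=k}^{k+N} h_{W,k}(t,i) = 1$. It follows, as in
Part (a), that

$$|\mathrm{Bias}[\hat{m}_W(t)]| = |E[\hat{m}_W(t)] - m(t)| \le 2\,\omega(m;\alpha_{W,k}(t)),$$

where now

$$\alpha^2_{W,k}(t) = \sum_{i=k}^{k+N}\left(\frac{i}{W} - t\right)^2 h_{W,k}(t,i) = \frac{1}{N}[1-(Wt-k)]\left(\frac{k}{W} - t\right)^2 + \frac{1}{N}\sum_{i=k+1}^{k+N-1}\left(\frac{i}{W} - t\right)^2 + \frac{1}{N}(Wt-k)\left(\frac{k+N}{W} - t\right)^2.$$

After some algebra we obtain

$$\alpha^2_{W,k}(t) = \frac{1}{N}\sum_{j=1}^{N-1}\left[\frac{j}{W} - \left(t - \frac{k}{W}\right)\right]^2 + [N^2 - (2N-1)(Wt-k)]\,\frac{(Wt-k)}{NW^2} = \frac{1}{N}\sum_{j=1}^{N-1}\left(\frac{j}{W}\right)^2 + \frac{(Wt-k)[1-(Wt-k)]}{W^2}.$$

Summing up the first term above and noting that the second term is bounded by
its maximum value $1/(4W^2)$ (since $0 \le Wt-k < 1$) we have

$$\alpha^2_{W,k}(t) \le \frac{(N-1)(2N-1)}{6W^2} + \frac{1}{4W^2} \le \frac{N^2}{3W^2}.$$

Thus for all $t \ge 0$

$$|\mathrm{Bias}[\hat{m}_W(t)]| \le 2\,\omega\!\left(m;\frac{N}{\sqrt{3}\,W}\right). \tag{14b}$$

From (16) we also have

$$\mathrm{Var}[\hat{m}_W(t)] = \sum_{i=k}^{k+N} h^2_{W,k}(t,i)\,\mathrm{Var}[Z_{W,i}] \le \sum_{i=k}^{k+N} h^2_{W,k}(t,i) \tag{18}$$

and by (17)

$$\sum_{i=k}^{k+N} h^2_{W,k}(t,i) = \frac{1}{N^2}\left\{[1-(Wt-k)]^2 + (N-1) + (Wt-k)^2\right\}.$$

Putting $x = Wt-k$ we have $0 \le x < 1$, for which
$\tfrac{1}{2} \le (1-x)^2 + x^2 \le 1$, so that

$$\frac{1}{N}\left(1 - \frac{1}{2N}\right) \le \sum_{i=k}^{k+N} h^2_{W,k}(t,i) \le \frac{1}{N}. \tag{19}$$

It follows that for all $t \ge 0$

$$\mathrm{Var}[\hat{m}_W(t)] \le \frac{1}{N}, \tag{15b}$$

so that by (14b) and (15b) we have for all $t \ge 0$

$$E[\hat{m}_W(t) - m(t)]^2 \le 4\,\omega^2\!\left(m;\frac{N}{\sqrt{3}\,W}\right) + \frac{1}{N},$$

and the final result for $\hat{s}_W(t)$ follows as in Part (a). □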
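The properties of the interpolation weights used in Part (b) can be checked numerically. The sketch below builds the weights (17) for k = 0 and W = 1 (a normalization chosen here purely for convenience, so that x stands for Wt - k) and exposes the three quantities the proof relies on: the weights sum to one, their sum of squares lies between (1/N)(1 - 1/(2N)) and 1/N, and the second moment about t equals $(1/N)\sum_{j=1}^{N-1} j^2 + x(1-x)$. The function names are hypothetical.

```python
def weights_8b(x, n):
    """Weights h(t, i), i = 0..N, of eq. (17) with k = 0 and x = W*t in [0, 1)."""
    return [(1.0 - x) / n] + [1.0 / n] * (n - 1) + [x / n]


def second_moment(x, n):
    """alpha^2 in units of 1/W^2: sum_i (i - x)^2 * h(t, i)."""
    return sum((i - x) ** 2 * h for i, h in enumerate(weights_8b(x, n)))


def second_moment_closed_form(x, n):
    """Closed form reached 'after some algebra' in Part (b), with W = 1."""
    return sum(j * j for j in range(1, n)) / n + x * (1.0 - x)
```

Evaluating `second_moment` and `second_moment_closed_form` at a few values of x in [0, 1) is a quick way to confirm the algebra before trusting the bound $N^2/(3W^2)$.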

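We now discuss the implications of Theorem 2.1. The first term in the bound is
due to the bias and the second is due to the variance of the estimates (8).
For a fixed sampling rate W, the block size N must be small to reduce the bias
but large to reduce the variance. This trade-off is standard in other areas as
well (e.g., the window-bandwidth parameter in spectral and probability density
estimation). One should therefore use an optimal block size $N_{opt}$ which
minimizes the bound on the mean-square error. Indeed, when s(t) is Lip
$\gamma$, $0 < \gamma \le 1$, i.e., $\omega(s;\delta) \le D_s \delta^{\gamma}$,
we find that $N_{opt}$ is given by (the integer part of)

$$N_{opt} = \left[\frac{3^{\gamma} K_2}{2\gamma K_1 D_s^2}\right]^{1/(1+2\gamma)} W^{2\gamma/(1+2\gamma)}, \tag{20}$$

for which the bound on the mean-square error becomes

$$E[\hat{s}_W(t) - s(t)]^2 \le \frac{1+2\gamma}{2\gamma}\left(\frac{2\gamma K_1 D_s^2}{3^{\gamma}}\right)^{1/(1+2\gamma)} K_2^{2\gamma/(1+2\gamma)}\; W^{-2\gamma/(1+2\gamma)}. \tag{21}$$
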
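The bias/variance trade-off can be made concrete with a short Python sketch. It assumes the Theorem 2.1 bound in the form $K_1\,\omega^2(s; N/(\sqrt{3}W)) + K_2/N$ with $\omega(s;\delta) \le D\,\delta^{\gamma}$, treats N as a continuous variable (the text takes the integer part), and obtains the minimizer by setting the derivative of the bound with respect to N to zero. The function names are hypothetical.

```python
import math


def mse_bound(n, w, k1, k2, d, gamma):
    """Bias term plus variance term of the bound for a Lip-gamma signal."""
    omega = d * (n / (math.sqrt(3.0) * w)) ** gamma
    return k1 * omega ** 2 + k2 / n


def optimal_block_size(w, k1, k2, d, gamma):
    """Continuous minimizer of mse_bound over n (before taking the integer part)."""
    coeff = 3.0 ** gamma * k2 / (2.0 * gamma * k1 * d * d)
    return coeff ** (1.0 / (1.0 + 2.0 * gamma)) * w ** (2.0 * gamma / (1.0 + 2.0 * gamma))
```

At the minimizer the bias term equals $K_2/(2\gamma N)$, so the bound reduces to $(1+2\gamma)K_2/(2\gamma N_{opt})$, which decays as $W^{-2\gamma/(1+2\gamma)}$.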
For the (nonsequential) estimates considered in [4], the mean-square error was
shown to be $O(W^{-\min(\gamma,1/2)})$ and thus, the estimates of the present
paper have faster rates of convergence for all $0 < \gamma \le 1$. For
instance, when the signal s(t) has a bounded derivative $|s'(t)| \le D_s$ for
all $t \ge 0$, then $\gamma = 1$, the optimal block size
$N_{opt} = O(W^{2/3})$ and the mean-square error of the present estimates is
$O(W^{-2/3})$ compared to $O(W^{-1/2})$ for the estimates considered in [4]. It
may be of interest to compare these rates to those of other comparable schemes
which also convert a continuous-time signal into a binary sequence by periodic
sampling and a direct 2-level quantization. One such popular scheme is the
standard delta modulation [12], which has been analyzed for stochastic signals
only; the most comprehensive analytical study was carried out by Slepian [12]
for stationary Gaussian input signals with rational spectral densities, but
unfortunately no closed form expressions for the mean-square error and its
rate of convergence as $W \to \infty$ are available. The only case we are aware
of, for which such closed form expressions are available, is that of a Wiener
process input [13]. In this case the rate of convergence of the steady-state
mean-square error, when an optimal step size is used, is $W^{-1}$ [13]. For our
scheme we have so far improved the rate from $W^{-1/2}$ to $W^{-2/3}$. Note
that the sample paths of a Wiener process are almost surely continuous but not
differentiable, and the comparison to our scheme with a Lip 1 signal s(t) may
be somewhat questionable. Still, the rate of convergence $W^{-1}$ of a standard
delta modulator with a Wiener process input provides a performance measure
with respect to which the performance of our scheme can be compared. It
remains an open question at this point to find the ultimate
convergence  rate  possible  for  our  scheme;  we  conjecture  it  to  be  W"^  (see  Section 
IV)  but,  so  far,  we  have  not  found  the  recovery  scheme  (receiver)  which  achieves 
this rate.


The actual sampling rate W needed to obtain a mean-square error not exceeding
a given level $\delta^2$ can be determined from (21). For example, for signals
s(t) having bounded derivatives $|s'(t)| \le D$ for all $t \ge 0$ we obtain
from (20) and (21)

$$W = \frac{3K_2\sqrt{K_1}\,D}{2\delta^3}, \qquad N_{opt} = \frac{3K_2}{2\delta^2}. \tag{22}$$

Note that while the required sampling rate W is proportional to the variations
parameter D of the signal, the block size N to be used in the receiver does
not depend on the variations of the signal.
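A small design helper illustrates (22). It is a sketch only: the function names are hypothetical, N is left as a real number rather than rounded to an integer, and the companion function re-evaluates the Theorem 2.1 bound for a signal with $|s'(t)| \le D$ (so that $\omega(s;\delta) \le D\delta$) to confirm that the chosen W and N meet the target.

```python
import math


def design_for_target(delta, k1, k2, d):
    """Eq. (22): sampling rate W and block size N for a target MSE delta**2."""
    w = 3.0 * k2 * math.sqrt(k1) * d / (2.0 * delta ** 3)
    n = 3.0 * k2 / (2.0 * delta ** 2)
    return w, n


def bound_for_bounded_derivative(n, w, k1, k2, d):
    """Theorem 2.1 bound with omega(s; N/(sqrt(3) W)) <= D * N / (sqrt(3) W)."""
    return k1 * (d * n / (math.sqrt(3.0) * w)) ** 2 + k2 / n
```

For the uniform dither (B1) with b = 1 ($K_1 = 4$, $K_2 = 1$), D = 1 and $\delta = 0.1$, this gives W = 3000 and N = 150, and the bound evaluates to $\delta^2 = 0.01$, one third of which comes from the bias term and two thirds from the variance term.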

When the distribution of the $\{X_k\}$ in the transmitter is Gaussian or
Laplacian, the constants $K_1$ and $K_2$ in Theorem 2.1 depend on the
parameters $(\sigma^2,\epsilon)$ under (B2) and on $(\alpha,\epsilon)$ under
(B3). These parameters have so far been left arbitrary positive constants. The
question of their optimal values, which minimize the mean-square error, is now
discussed when the signal s(t) is Lip $\gamma$, $0 < \gamma \le 1$. From (21)
it is seen that the mean-square error is proportional to
$K_1^{1/(1+2\gamma)} K_2^{2\gamma/(1+2\gamma)}$. Minimizing
$K_1 K_2^{2\gamma}$ with respect to $(\sigma^2,\epsilon)$ under (B2), and with
respect to $(\alpha,\epsilon)$ under (B3), yields

$$\sigma = \left(\frac{1+2\gamma}{2\gamma}\right)^{1/2}(1+y_0)\,b, \quad \epsilon = y_0 b; \qquad \text{under (B2)}$$

$$\alpha^{-1} = \left(\frac{1+2\gamma}{2\gamma}\right)(1+y_0)\,b, \quad \epsilon = y_0 b; \qquad \text{under (B3)}$$

where $y_0$ is the positive real root of the cubic equation

$$y^3 = \frac{1}{2\gamma}\,[\,y + (2\gamma+1)\,].$$

For example, for Lip 1 signals s(t) the optimal choice is

$$\sigma = 2.8042\,b, \quad \epsilon = 1.2896\,b; \qquad \text{under (B2)}$$

$$\alpha^{-1} = 3.4344\,b, \quad \epsilon = 1.2896\,b; \qquad \text{under (B3)} \tag{23}$$

It is thus seen that under (B2) or (B3), the variance of the $\{X_k\}$ in the
transmitter should be chosen to be proportional to $b^2$, and that the constant
$\epsilon$ in the nonlinearity g(x) in the receiver should be proportional to
b.
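The optimal dither parameters can be computed for any Lip exponent with a few lines of Python. The sketch below assumes the cubic in the form $2\gamma y^3 - y - (2\gamma+1) = 0$ (which reproduces the quoted root 1.2896 for $\gamma = 1$) and solves it by bisection; the function names and the returned dictionary keys are hypothetical.

```python
import math


def cubic_root_y0(gamma, tol=1e-12):
    """Positive root of 2*gamma*y**3 - y - (2*gamma + 1) = 0, by bisection."""
    def f(y):
        return 2.0 * gamma * y ** 3 - y - (2.0 * gamma + 1.0)

    lo, hi = 0.0, 1.0
    while f(hi) < 0.0:  # f(0) < 0 and f grows without bound, so this terminates
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def optimal_dither_parameters(gamma, b):
    """Optimal epsilon, sigma (B2) and 1/alpha (B3) for peak constraint b."""
    y0 = cubic_root_y0(gamma)
    scale = (1.0 + 2.0 * gamma) / (2.0 * gamma)
    return {
        "epsilon": y0 * b,
        "sigma": math.sqrt(scale) * (1.0 + y0) * b,
        "alpha_inv": scale * (1.0 + y0) * b,
    }
```

Because every returned quantity is a multiple of b, the dither standard deviation and the receiver constant scale linearly with the peak constraint, which is the proportionality noted above.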

C. Probability One Convergence and a Central Limit Theorem

We first obtain bounds on the higher order moments of the error
$\hat{s}_W(t) - s(t)$ which provide faster rates of convergence.

Theorem 2.2. Let Assumptions (A) and (B) be satisfied and let the signal s(t)
be Lip $\gamma$, $0 < \gamma \le 1$. If the block size N is of the form
$N = A\,W^{2\gamma/(1+2\gamma)}$ (cf. (20)), then for every integer
$\ell \ge 1$, the estimate $\hat{s}_W(t)$ satisfies

$$E[\hat{s}_W(t) - s(t)]^{2\ell} \le \frac{C_{\ell,\gamma}}{W^{2\gamma\ell/(1+2\gamma)}}$$

uniformly in $t \ge 0$ for some constant $C_{\ell,\gamma}$.

Proof. The result for $\hat{s}_W(t) - s(t)$ follows by Proposition 1 from the
result for $\hat{m}_W(t) - m(t)$, which we now establish. Writing for brevity
$\hat{m}$, m for $\hat{m}_W(t)$, m(t), we obtain from
$\hat{m} - m = \mathrm{Bias}[\hat{m}] + (\hat{m} - E[\hat{m}])$

$$E[\hat{m} - m]^{2\ell} = (\mathrm{Bias}[\hat{m}])^{2\ell} + \sum_{j=2}^{2\ell}\binom{2\ell}{j}(\mathrm{Bias}[\hat{m}])^{2\ell-j}\, E[\hat{m} - E[\hat{m}]]^{j}. \tag{24}$$

Using the bound on the cumulants of $\hat{m}$ given in Proposition 2 in the
Appendix and the fact that the moments of $\hat{m}$ can be expressed as finite
linear combinations of the cumulants of $\hat{m}$, we obtain, as in the proof
of Theorem 4.2 of [4], that

$$\left|E[\hat{m} - E[\hat{m}]]^{j}\right| \le \frac{H_j}{N^{\,j-[j/2]}} \tag{25}$$

uniformly in $t \ge 0$, where $H_j$ is a constant and [j/2] is the integer part
of j/2. Since s(t) is Lip $\gamma$, so is m(t) (cf. (13)) and by (14a)-(14b)
we have

$$|\mathrm{Bias}[\hat{m}]| \le 2\,\omega\!\left(m;\frac{N}{\sqrt{3}\,W}\right) \le 2 D_m \left(\frac{N}{\sqrt{3}\,W}\right)^{\gamma}. \tag{26}$$
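Putting $N = A\,W^{2\gamma/(1+2\gamma)}$ in (25) and (26) and substituting in
(24) we obtain

$$E[\hat{m} - m]^{2\ell} \le \frac{(L A^{\gamma})^{2\ell}}{W^{2\gamma\ell/(1+2\gamma)}} + (1 + o(1))\sum_{j=2}^{2\ell}\binom{2\ell}{j} L^{2\ell-j} H_j A^{p_j}\,\frac{1}{W^{q_j}}, \tag{27}$$

where the constants are $L = 2D_m/3^{\gamma/2}$,
$p_j = \gamma(2\ell - j) - j + [j/2]$, and

$$q_j = \frac{\gamma\,(2\ell + j - 2[j/2])}{1+2\gamma}.$$

Since

$$q_{2n} = \frac{2\gamma\ell}{1+2\gamma}, \qquad q_{2n+1} = \frac{\gamma(2\ell+1)}{1+2\gamma}, \qquad n = 1, 2, \ldots,$$

it follows that the terms in the sum $\sum_{j=2}^{2\ell}$ in (27) with j even
are dominant, and thus by (27)

$$E[\hat{m} - m]^{2\ell} \le \frac{K_{\ell,\gamma} + o(1)}{W^{2\gamma\ell/(1+2\gamma)}}$$

uniformly in $t \ge 0$, where

$$K_{\ell,\gamma} = (L A^{\gamma})^{2\ell} + \sum_{n=1}^{\ell}\binom{2\ell}{2n} L^{2(\ell-n)} A^{p_{2n}} H_{2n}. \qquad \Box$$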

The convergence rate $O(W^{-2\gamma\ell/(1+2\gamma)})$ given in Theorem 2.2 is
again faster than the rate $O(W^{-\ell\min(\gamma,1/2)})$ obtained earlier for
the nonsequential estimates considered in [4]. Theorem 2.2 implies the
convergence with probability one of the estimate $\hat{s}_W(t)$ to s(t) as
$W \to \infty$ (i.e., corresponding to almost every realization of the
sequence $\{X_k\}_{k=0}^{\infty}$ in the transmitter). This strong consistency
of $\hat{s}_W(t)$ together with the rate of convergence is given in the
following.


Theorem 2.3. Let Assumptions (A) and (B) be satisfied, let the signal s(t) be
Lip $\gamma$, $0 < \gamma \le 1$, and assume the block size N to be of the form
$N = A\,W^{2\gamma/(1+2\gamma)}$. Then for each fixed $t \ge 0$ we have with
probability one

$$(W_0)^{\theta} \sup_{W \ge W_0} |\hat{s}_W(t) - s(t)| \to 0 \quad \text{as } W_0 \to \infty$$

for every constant $\theta$ satisfying $0 < \theta < \gamma/(1+2\gamma)$.

Proof. We note that for each fixed $t \ge 0$, the estimate $\hat{s}_W(t)$,
regarded as a random process with parameter $W > 0$, is separable. The result
then follows from Theorem 2.2 and Kolmogorov's theorem (Neveu [14, p. 97]) in
the manner of the proof of Theorem 4.3 in [4]. □

For example, when s(t) has a bounded derivative then $\gamma = 1$ and with
probability one we have, in particular,

$$W^{\theta}\,|\hat{s}_W(t) - s(t)| \to 0 \quad \text{as } W \to \infty$$

for all $0 < \theta < 1/3$. For the estimates considered in [4] we obtained
$0 < \theta < 1/4$ for the same example and, thus, the sequential estimates of
the present paper have a faster rate of almost sure convergence.

We finally derive a central limit theorem for the estimation error
$\hat{s}_W(t) - s(t)$ which is useful in obtaining confidence intervals for
this error. Define the normalized error process

$$\xi_W(t) = \beta_W(t)\,[\hat{s}_W(t) - s(t)], \qquad t \ge 0,$$

where

$$\beta_W(t) = \mu'[s(t)]\;\mathrm{Var}^{-1/2}[\hat{m}_W(t)]. \tag{28}$$

In the following we shall assume that under (B1), X is uniform over [-c,c]
with c > b, in which case the estimate (10) is replaced by
$\hat{s}_W(t) = c\,\hat{m}_W(t)$.

Theorem 2.4. Let Assumptions (A) and (B) be satisfied and let the signal s(t)
be Lip $\gamma$, $0 < \gamma \le 1$. Assume, in addition, that under (B1), X is
uniform over [-c,c] with c > b. If the block size N is chosen to be of the form

$$N = A\,W^{\lambda}, \qquad 0 < \lambda < \frac{2\gamma}{1+2\gamma},$$

then for each fixed $t \ge 0$, $\xi_W(t)$ is asymptotically a standard normal
variable as $W \to \infty$, and for distinct t's the values of the process
$\{\xi_W(t), t \ge 0\}$ are asymptotically independent.

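The normalizing factor $\beta_W(t)$ is bounded from above and below, uniformly
in $t \ge 0$, as follows:

$$\text{For (8a):} \quad M_1 N^{1/2} \le \beta_W(t) \le M_2 N^{1/2}$$

$$\text{For (8b):} \quad M_1 N^{1/2} \le \beta_W(t) \le M_2 N^{1/2}\left(1 - \frac{1}{2N}\right)^{-1/2} \tag{29}$$

where

for (B1): $M_1 = 1/c$; $M_2 = 1/\sqrt{c^2 - b^2}$,

for (B2): $M_1 = \sqrt{2/\pi}\,\sigma^{-1}\exp(-b^2/2\sigma^2)$; $M_2 = \sqrt{2/\pi}\,\sigma^{-1}\,[1 - \mu_N^2(b)]^{-1/2}$,

for (B3): $M_1 = \alpha\exp(-\alpha b)$; $M_2 = \alpha\exp(\alpha b/2)\,[2 - \exp(-\alpha b)]^{-1/2}$.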
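The constants bounding the normalizing factor can be evaluated for any admissible dither. The sketch below assumes $M_1 = \mu'(b)$ and $M_2 = \mu'(0)/\sqrt{1-\mu^2(b)}$, which is how the bounds arise from (28), the variance bounds used in the proof, and the fact that $\mu'(b) \le \mu'[s(t)] \le \mu'(0)$; the function name is hypothetical and the caller supplies $\mu$ and its derivative.

```python
import math


def beta_bound_constants(mu, mu_prime, b):
    """Return (M1, M2) with M1 = mu'(b) and M2 = mu'(0) / sqrt(1 - mu(b)**2).

    mu       : callable s -> mu(s), the moment function of the dither
    mu_prime : callable s -> derivative of mu at s
    b        : peak constraint on the signal
    """
    m1 = mu_prime(b)
    m2 = mu_prime(0.0) / math.sqrt(1.0 - mu(b) ** 2)
    return m1, m2
```

For the Laplacian dither, passing `mu = lambda s: 1 - math.exp(-alpha * s)` and `mu_prime = lambda s: alpha * math.exp(-alpha * abs(s))` recovers the closed forms listed for (B3).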

Proof. We prove the asymptotic normality and independence of the normalized
error process for m(t):

$$\eta_W(t) = \frac{\hat{m}_W(t) - m(t)}{\mathrm{Var}^{1/2}[\hat{m}_W(t)]}, \qquad t \ge 0.$$

The corresponding results for the signal s(t), as stated in the theorem, will
then follow by using a result of Mann and Wald [15, p. 226] in the manner of
the proof of Theorem 3.4 in [4]. Putting

$$\zeta_W(t) = \frac{\hat{m}_W(t) - E[\hat{m}_W(t)]}{\mathrm{Var}^{1/2}[\hat{m}_W(t)]}, \qquad t \ge 0,$$

we have

$$\eta_W(t) = \zeta_W(t) + \frac{\mathrm{Bias}[\hat{m}_W(t)]}{\mathrm{Var}^{1/2}[\hat{m}_W(t)]}, \qquad t \ge 0.$$

The proof is accomplished by showing that as $W \to \infty$ the second term
tends to zero and $\zeta_W(t)$ is asymptotically a standard normal variable
with asymptotically independent values for distinct t's.

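We have $V_2 \le \mathrm{Var}[Z_{W,i}] \le 1$ where

$$V_2 = \min_{|s| \le b}\mathrm{Var}[\mathrm{sgn}(s+X)] = 1 - \max_{|s| \le b}\mu^2(s) = 1 - \mu^2(b).$$

$V_2$ is easily calculated under (B2), (B3) and the modified (B1), and in all
cases $V_2 > 0$. Thus by (15a) for the estimate (8a), and by (18) and (19) for
the estimate (8b), we have

$$\text{for (8a):} \quad \frac{V_2}{N} \le \mathrm{Var}[\hat{m}_W(t)] \le \frac{1}{N} \tag{30a}$$

$$\text{for (8b):} \quad \frac{V_2}{N}\left(1 - \frac{1}{2N}\right) \le \mathrm{Var}[\hat{m}_W(t)] \le \frac{1}{N}. \tag{30b}$$

Hence by (14a) and the lower bound in (30a) we have for the estimate (8a)

$$\frac{|\mathrm{Bias}[\hat{m}_W(t)]|}{\mathrm{Var}^{1/2}[\hat{m}_W(t)]} \le 2\left(\frac{N}{V_2}\right)^{1/2}\omega\!\left(m;\frac{N}{\sqrt{3}\,W}\right) \to 0 \quad \text{as } W \to \infty$$

by assumption on N. Similarly for the estimate (8b) (cf. (14b) and (30b)).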


We now consider $\zeta_W(t)$. Clearly $E[\zeta_W(t)] = 0$ and
$\mathrm{Var}[\zeta_W(t)] = 1$ for all $t \ge 0$. Also for $r \ge 3$ and all
instants $t_1,\ldots,t_r \ge 0$ (not necessarily distinct) we have

$$\mathrm{Cum}_r\{\zeta_W(t_1),\ldots,\zeta_W(t_r)\} = \frac{\mathrm{Cum}_r\{\hat{m}_W(t_1),\ldots,\hat{m}_W(t_r)\}}{\prod_{j=1}^{r}\mathrm{Var}^{1/2}[\hat{m}_W(t_j)]}$$

and using Proposition 2 for the numerator and the lower bound in (30a)-(30b)
for the denominator we find

$$\mathrm{Cum}_r\{\zeta_W(t_1),\ldots,\zeta_W(t_r)\} \to 0 \quad \text{as } W \to \infty$$

since $r \ge 3$. Finally, for $t_1 \ne t_2$ we have
$E[\zeta_W(t_1)\zeta_W(t_2)] \to 0$ as $W \to \infty$ since for large enough W
(say $(N+1)/W < |t_1 - t_2|$), $\zeta_W(t_1)$ and $\zeta_W(t_2)$ are
independent, as $\hat{m}_W(t_1)$ and $\hat{m}_W(t_2)$ are expressed in terms of
nonoverlapping blocks of the $Z_{W,k}$'s. It then follows by Lemma P4.5 of [16]
that the finite dimensional distributions of $\{\zeta_W(t), t \ge 0\}$
converge as $W \to \infty$ to the finite dimensional distributions of a
Gaussian process with mean zero and covariance $R(t_1,t_2) = 1$ for
$t_1 = t_2$ and $R(t_1,t_2) = 0$ for $t_1 \ne t_2$, which establishes the
desired result for $\hat{m}_W(t)$.

The bounds (29) for the normalization factor $\beta_W(t)$ follow from (28),
(30) and the observation that

$$\mu'(b) \le \mu'[s(t)] \le \mu'(0). \qquad \Box$$

III.  TRANSMISSION  OVER  A  NOISY  CHANNEL  -  ERROR  ANALYSIS 

In this section we study the degradation in the performance of the receiver of
Section II when the binary data $\{Z_{W,k}\}$ is transmitted over a noisy
channel. The modification in the transmitter/receiver structure of Figure 1 is
shown in Figure 2. The binary sequence $\{Z_{W,k}\}$ is now pulse modulated,

$$p(t) = \sum_{k=0}^{\infty} Z_{W,k}\, a\!\left(t - \frac{k}{W}\right),$$

where a(t) is the transmission filter, i.e., a(t) is a fixed function over
[0,1/W], vanishing outside of [0,1/W], with finite energy

$$e_W = \int_0^{1/W} |a(t)|^2\,dt.$$
The power of the transmitter is then

$$P_W = W e_W = W\int_0^{1/W} |a(t)|^2\,dt. \tag{31}$$

A possible choice for a(t) is, for example,
$a(t) = C\sin(\pi W t)\,1_{[0,1/W]}(t)$, for which p(t) resembles a PSK waveform
and the transmitter power is then $P_W = C^2/2$. The channel noise $\{n(t)\}$
is assumed to be a wide-sense stationary process, independent of the sequence
$\{X_k\}$, with mean zero and covariance function R(t). The channel noise need
not be Gaussian nor white. The received waveform is then

$$r(t) = p(t) + n(t), \qquad t \ge 0.$$
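The transmitter power (31) is easy to evaluate numerically for any candidate pulse. The following Python sketch uses a midpoint rule on [0, 1/W]; the function name and the number of integration steps are arbitrary choices made for illustration.

```python
import math


def transmitter_power(pulse, w, steps=10000):
    """Eq. (31): P_W = W * integral of |a(t)|^2 over [0, 1/W], midpoint rule.

    pulse : callable t -> a(t), only evaluated on [0, 1/W]
    w     : sampling (symbol) rate W
    steps : number of midpoint-rule subintervals
    """
    dt = 1.0 / (w * steps)
    energy = sum(abs(pulse((j + 0.5) * dt)) ** 2 for j in range(steps)) * dt
    return w * energy
```

With `pulse = lambda t: C * math.sin(math.pi * W * t)` the result agrees with the value $C^2/2$ quoted above, independently of W.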


The modification in the receiver, as shown in Figure 2, is based on the simple idea of first estimating $Z_{W,k}$ from the received waveform $r(t)$ over the $k$-th interval $[k/W, (k+1)/W)$ and then using the estimates $\{\hat Z_{W,k}\}$ as the input to our previous (noiseless channel) receiver. This is accomplished by employing a standard matched filter whose output

$$T_k = \int_{k/W}^{(k+1)/W} a\!\left(t - \frac{k}{W}\right) r(t)\,dt, \qquad k = 0,1,\ldots, \tag{32}$$

is used to estimate $Z_{W,k}$ by

$$\hat Z_{W,k} = \psi(T_k), \qquad k = 0,1,\ldots, \tag{33}$$

where the transformation $\psi(x)$ is specified below. Hence the estimate $\hat s_W(t)$ of $s(t)$ is given by (9),

$$\hat s_W(t) = g[\hat m_W(t)], \qquad t \ge 0, \tag{9}$$


where, as before, $g(x)$ is specified in Assumption B, but $\hat m_W(t)$ is now determined from the block averages

$$\frac{1}{N}\sum_{i=0}^{N-1} \hat Z_{W,k+i}, \qquad k = 0,1,\ldots, \tag{7'}$$

by (8a) or (8b). For convenience of notation we shall write $\hat m_W(t)$ in the form

$$\hat m_W(t) = \sum_{i=0}^{\infty} h_W(t,i)\, \hat Z_{W,i}, \qquad t \ge 0, \tag{8'}$$

where the kernel $\{h_W(t,i)\}$ can easily be identified for the sequential estimates (8a)-(8b).

In fact, the analysis in this section is valid for any $\{h_W(t,i)\}$ corresponding to a positive linear operator with

i. $h_W(t,i) \ge 0$ for all $t \ge 0$, $i \ge 0$,

ii. $\sum_{i=0}^{\infty} h_W(t,i) = 1$ for all $t \ge 0$,

iii. $\sum_{i=0}^{\infty} i^2\, h_W(t,i) < \infty$ for all $t \ge 0$,

iv. $\sum_{i=0}^{\infty} h_W^2(t,i) < \infty$ for all $t \ge 0$. (34)
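As an illustration of conditions (34), the sketch below builds a block-average kernel and evaluates the two quantities that enter the error bounds of Theorem 3.1, namely $\sum_i (t - i/W)^2 h_W(t,i)$ and $\sum_i h_W^2(t,i)$. The window used here (weight $1/N$ on the $N$ indices starting at $k = \lfloor Wt \rfloor$, following (A1) of the Appendix) and all function names are assumptions made for this example.

```python
import math

def block_kernel(t, W, N):
    """Weights h_W(t, i) = 1/N for i = k, ..., k+N-1 with k = floor(W t)."""
    k = math.floor(t * W)
    return {i: 1.0 / N for i in range(k, k + N)}

def kernel_summary(t, W, N):
    """Return (sum of weights, alpha^2, v^2) for the block kernel at time t."""
    h = block_kernel(t, W, N)
    total = sum(h.values())                                    # condition (34.ii)
    alpha_sq = sum((t - i / W) ** 2 * w for i, w in h.items())  # bias scale
    v_sq = sum(w * w for w in h.values())                       # variance scale
    return total, alpha_sq, v_sq

if __name__ == "__main__":
    total, alpha_sq, v_sq = kernel_summary(t=0.25, W=100.0, N=10)
    print(total, alpha_sq, v_sq)
```

For this window the weights sum to one, $v^2 = 1/N$, and $\alpha^2$ stays below $N^2/(3W^2)$, which is the scale $N/(\sqrt{3}\,W)$ appearing in the sliding-window bounds.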

Consider now the question of choosing the nonlinearity $\psi(x)$ in (33). It is natural to base the choice on minimizing the mean-square error $E[\hat Z_{W,k} - Z_{W,k}]^2$ under the constraint that $\hat Z_{W,k}$ takes on the values $\pm 1$. For example, when the channel noise is white and Gaussian with $R(t) = (\nu_0/2)\delta(t)$, the analysis gives $\psi(x) = \operatorname{sgn} x$ (i.e., $\hat Z_{W,k} = \operatorname{sgn}[T_k]$), which corresponds to the classical optimal detector (nonparametric in the signal $s(t)$); see Part C of the Appendix. For this case one finds $E[\hat Z_{W,k} - Z_{W,k}]^2 = 4[1 - \Phi(d_W)]$, where $d_W = \sqrt{2e_W/\nu_0}$ is "the signal to channel-noise ratio". However, as seen from the Appendix, this optimal detector for the symbol $Z_{W,k}$ is not necessarily optimal as far as the estimation of $s(t)$ is concerned. In fact, the linear choice $\psi(x) = x/e_W$, which gives the worst possible error $E[\hat Z_{W,k} - Z_{W,k}]^2 = 1$ under the condition $d_W = 1$, provides a smaller degradation in the estimation of $s(t)$ than the "optimal" $\psi(x) = \operatorname{sgn} x$. Consequently, we shall assume in this section that

$$\hat Z_{W,k} = \frac{T_k}{e_W}, \qquad k = 0,1,\ldots, \tag{35}$$

which is simply a scaling of the matched filter output.
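The per-symbol comparison made above can be put into numbers. The sketch below evaluates the symbol mean-square error of the sign detector, $4[1 - \Phi(d_W)]$, and of the linear scaling (35), for which the error equals the variance $1/d_W^2$ of the additive noise term in the white noise case. The function names are hypothetical, and $\Phi$ is taken to be the standard normal distribution function.

```python
import math

def std_normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def symbol_mse_sign(d_w):
    """E[(Zhat - Z)^2] for Zhat = sgn(T_k): 4 * [1 - Phi(d_W)]."""
    return 4.0 * (1.0 - std_normal_cdf(d_w))

def symbol_mse_linear(d_w):
    """E[(Zhat - Z)^2] for Zhat = T_k / e_W under white noise: 1 / d_W^2."""
    return 1.0 / d_w ** 2

if __name__ == "__main__":
    for d_w in (0.5, 1.0, 2.0):
        print(d_w, symbol_mse_sign(d_w), symbol_mse_linear(d_w))
```

At $d_W = 1$ the sign detector has the smaller symbol error (about 0.63 against 1), which is why its poorer performance for the estimation of $s(t)$ is noteworthy.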

We  have. 

Theorem 3.1. Let Assumptions (A) and (B) be satisfied and the estimate $\hat s_W(t)$ be given by (9) and (8') with $\hat Z_{W,k}$ given by (35) and $\{h_W(t,i)\}$ satisfying (34). Then

a) When the channel noise is white, with $R(t) = (\nu_0/2)\delta(t)$, we have for each fixed $t \ge 0$,

$$E[\hat s_W(t) - s(t)]^2 \le K_1\, \omega^2(s;\alpha_W(t)) + K_2\left(1 + \frac{1}{d_W^2}\right) v_W^2(t).$$

b) When the channel noise is colored, with arbitrary continuous correlation $R(t)$, we have for each fixed $t \ge 0$,

$$E[\hat s_W(t) - s(t)]^2 \le K_1\, \omega^2(s;\alpha_W(t)) + K_2\left(v_W^2(t) + \frac{R(0)}{W e_W}\right),$$

where in both parts (a) and (b)

$$\alpha_W^2(t) = \sum_{i=0}^{\infty} \left(t - \frac{i}{W}\right)^2 h_W(t,i),$$

$$v_W^2(t) = \sum_{i=0}^{\infty} h_W^2(t,i),$$

$$d_W = \sqrt{2e_W/\nu_0},$$

and the constants $K_1$ and $K_2$ are as in Theorem 2.1.

Corollary 3.1. Assume $\{h_W(t,i)\}$ corresponds to the sliding windows of Section II (cf. (8a)-(8b)). Then

a) When the channel noise is white, the sequential estimates $\hat s_W(t)$ satisfy, uniformly in $t \ge 0$,

$$E[\hat s_W(t) - s(t)]^2 \le K_1\, \omega^2\!\left(s;\frac{N}{\sqrt{3}\,W}\right) + \frac{K_2}{N}\left(1 + \frac{1}{d_W^2}\right).$$

b) When the channel noise is colored, the sequential estimates $\hat s_W(t)$ satisfy, uniformly in $t \ge 0$,

$$E[\hat s_W(t) - s(t)]^2 \le K_1\, \omega^2\!\left(s;\frac{N}{\sqrt{3}\,W}\right) + K_2\left(\frac{1}{N} + \frac{R(0)}{W e_W}\right).$$

A comparison of Corollary 3.1 with Theorem 2.1 shows that the channel noise simply increases the variance of the estimate $\hat s_W(t)$ by a factor inversely proportional to the "signal to channel-noise power ratio" (e.g. $W e_W/R(0)$ in the colored noise case).

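Proof. By (32) we have

$$T_k = Z_{W,k} \int_0^{1/W} |a(t)|^2\,dt + \int_0^{1/W} a(t)\, n\!\left(t + \frac{k}{W}\right) dt$$

so that by (35)

$$\hat Z_{W,k} = Z_{W,k} + \xi_k, \qquad \xi_k = \frac{1}{e_W}\int_0^{1/W} a(t)\, n\!\left(t + \frac{k}{W}\right) dt, \tag{36}$$

where $\{\xi_k\}$ is a wide-sense stationary sequence, independent of $\{Z_{W,k}\}$, with mean zero and covariance sequence $\rho_n = E[\xi_{n+k}\xi_k]$ given by

$$\rho_n = \frac{1}{e_W^2}\int_0^{1/W}\!\!\int_0^{1/W} a(t)\,a(u)\, R\!\left(t - u + \frac{n}{W}\right) dt\,du. \tag{37}$$

Since the $Z_{W,k}$'s are independent with mean $m(k/W)$ and $\operatorname{Var}[Z_{W,k}] \le 1$, we have by (36)

$$E[\hat Z_{W,k}] = m(k/W)$$

and

$$|\operatorname{Cov}\{\hat Z_{W,i},\hat Z_{W,j}\}| \le \begin{cases} \delta_{i-j} + |\rho_{i-j}|, & \text{noise is colored}, \\ \left(1 + \dfrac{1}{d_W^2}\right)\delta_{i-j}, & \text{noise is white}. \end{cases} \tag{38}$$

Hence for the estimate $\hat m_W(t)$, given by (8'), we have

$$E[\hat m_W(t)] = \sum_{i=0}^{\infty} h_W(t,i)\, m(i/W) \tag{39}$$

and

$$\operatorname{Var}[\hat m_W(t)] = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} \operatorname{Cov}\{\hat Z_{W,i},\hat Z_{W,j}\}\, h_W(t,i)\, h_W(t,j). \tag{40}$$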

Since $m(t)$, $t \ge 0$, is a uniformly continuous function, we have by the result of [11], on the interpolation of continuous functions by positive linear operators, and (39) that for each fixed $t \ge 0$

$$|\operatorname{Bias}[\hat m_W(t)]| \le 2\,\omega(m;\alpha_W(t)) \tag{41}$$

where $\alpha_W^2(t)$ is given in the theorem. For the variance expression (40) we consider the white noise case first. Then by (38)

$$\operatorname{Var}[\hat m_W(t)] \le \left(1 + \frac{1}{d_W^2}\right)\sum_{i=0}^{\infty} h_W^2(t,i) = \left(1 + \frac{1}{d_W^2}\right) v_W^2(t). \tag{42a}$$

When the channel noise is colored we have $|\rho_{i-j}| \le \rho_0$, and by (37)

$$\rho_0 \le \frac{R(0)}{e_W^2}\left[\int_0^{1/W} |a(t)|\,dt\right]^2 \le \frac{R(0)}{W e_W},$$

where we have used the Cauchy-Schwarz inequality in the last step. Hence by (38) and (40)

$$\operatorname{Var}[\hat m_W(t)] \le \sum_{i=0}^{\infty} h_W^2(t,i) + \frac{R(0)}{W e_W}\left[\sum_{i=0}^{\infty} h_W(t,i)\right]^2 = v_W^2(t) + \frac{R(0)}{W e_W}, \tag{42b}$$

where the last step follows by (34.ii). We thus have by (41) and (42),

$$E[\hat m_W(t) - m(t)]^2 \le \begin{cases} 4\,\omega^2(m;\alpha_W(t)) + \left(1 + \dfrac{1}{d_W^2}\right) v_W^2(t), & \text{noise is white}, \\ 4\,\omega^2(m;\alpha_W(t)) + v_W^2(t) + \dfrac{R(0)}{W e_W}, & \text{noise is colored}, \end{cases}$$

and the result for $\hat s_W(t)$ follows from (13) and Proposition 1 in the Appendix. $\square$
The implications of Corollary 3.1 are now considered in more detail. In the case of colored noise, the contribution of the channel noise to the mean-square error of $\hat s_W(t)$ is the additive term $K_2 R(0)/(W e_W) = K_2 R(0)/P_W$, where $P_W$ is the power of the transmitter. Thus to combat the channel noise, $P_W$ must be proportional to $W^{\lambda}$ for some $\lambda > 0$. In the case of white noise, the contribution is the additive term $K_2/(N d_W^2)$. If the block size $N$ is chosen to be $O(W^{2\gamma/(1+2\gamma)})$ for Lip $\gamma$ signals (cf. $N_{opt}$ given in (20)), then $P_W$ must be proportional to $W^{[2\gamma/(1+2\gamma)] + \lambda}$ for some $\lambda > 0$ in order to combat the channel noise. It is then clear that the transmission power $P_W$ must be appreciably higher in the white noise case than in the colored noise case for the same channel noise contribution to the mean-square error of $\hat s_W(t)$. It is also of practical interest to obtain the values of the parameters $(W, N, P_W)$ for the reconstruction of the signal $s(t)$ to be achieved with mean-square error not exceeding a given level $\delta^2$. For simplicity we carry out the analysis for signals $s(t)$ having bounded derivatives $|s'(t)| \le D$ (i.e., $s(t) \in$ Lip 1), and we assume that the sinusoidal pulse

$$a(t) = \begin{cases} C \sin \pi W t, & 0 \le t \le 1/W, \\ 0, & \text{elsewhere}, \end{cases} \tag{43}$$

is used as a transmission filter.

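a) White Channel Noise. Choose $C^2 = \nu_0 W$. Then $d_W^2 = 1$ and the channel noise simply doubles the variance of the estimate $\hat s_W(t)$. We have by Corollary 3.1

$$E[\hat s_W(t) - s(t)]^2 \le K_1 D^2 \left[\frac{N}{\sqrt{3}\,W}\right]^2 + \frac{2K_2}{N}.$$

Minimizing the right hand side with respect to $N$, we obtain the optimal block size to be (the integer part of)

$$N_{opt} = \left[\frac{3K_2}{K_1 D^2}\right]^{1/3} W^{2/3},$$

for which

$$E[\hat s_W(t) - s(t)]^2 \le 3\left(\frac{K_1 K_2^2 D^2}{3}\right)^{1/3}\left(\frac{1}{W}\right)^{2/3}.$$

Hence, for the mean-square error to be less than $\delta^2$, we require the values of $(W,N,C^2)$ to be

$$W \ge 3\sqrt{K_1}\,K_2\,(D/\delta^3), \qquad N = 3K_2\,(1/\delta^2), \qquad C^2 = 3\nu_0\sqrt{K_1}\,K_2\,(D/\delta^3). \tag{44}$$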

Under (B1), the simplest transmitter/receiver structure, these values are

$$W \ge 6b^2(D/\delta^3), \qquad N = 3b^2(1/\delta^2), \qquad C^2 = 6\nu_0 b^2(D/\delta^3). \tag{45}$$
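The design values for the white noise case can be tabulated with a few lines of code. The sketch below reads (44) as $W = 3\sqrt{K_1}K_2 D/\delta^3$, $N = 3K_2/\delta^2$, $C^2 = \nu_0 W$, and reads the bound being minimized as $K_1 D^2 [N/(\sqrt{3}W)]^2 + 2K_2/N$; these readings, the function names, and the sample constants $K_1 = 4$, $K_2 = b^2 = 1$ (chosen so that the output agrees with (45)) are assumptions made for illustration.

```python
import math

def white_noise_design(delta, D, K1, K2, nu0):
    """Smallest (W, N, C^2) meeting mean-square error delta^2, per (44).

    N is returned as a real number; the text uses its integer part.
    """
    W = 3.0 * math.sqrt(K1) * K2 * D / delta ** 3
    N = 3.0 * K2 / delta ** 2
    C_sq = nu0 * W  # the choice C^2 = nu0 * W makes d_W = 1
    return W, N, C_sq

def white_noise_bound(W, N, D, K1, K2):
    """Bias term plus doubled variance term for Lip 1 signals with d_W = 1."""
    return K1 * D ** 2 * (N / (math.sqrt(3.0) * W)) ** 2 + 2.0 * K2 / N

if __name__ == "__main__":
    W, N, C_sq = white_noise_design(delta=0.1, D=1.0, K1=4.0, K2=1.0, nu0=1.0)
    print(W, N, C_sq, white_noise_bound(W, N, 1.0, 4.0, 1.0))
```

With these sample constants the bound evaluates to $\delta^2$ at the returned $(W, N)$, which is a useful consistency check on the formulas.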

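b) Colored Channel Noise. By Corollary 3.1

$$E[\hat s_W(t) - s(t)]^2 \le K_1 D^2 \left[\frac{N}{\sqrt{3}\,W}\right]^2 + K_2\left(\frac{1}{N} + \frac{2R(0)}{C^2}\right).$$

Minimizing the right hand side relative to $N$, we find

$$N_{opt} = \left[\frac{3K_2}{2K_1 D^2}\right]^{1/3} W^{2/3},$$

and if we demand, as in case (a), that the noise contribution simply doubles the variance we find

$$C^2 = 2R(0)\,N_{opt},$$

and for the mean-square error to be less than $\delta^2$ we need the values of $(W,N,C^2)$ to be

$$W \ge \sqrt{125/12}\,\sqrt{K_1}\,K_2\,(D/\delta^3), \qquad N = \tfrac{5}{2}K_2\,(1/\delta^2), \qquad C^2 = 5R(0)\,K_2\,(1/\delta^2). \tag{46}$$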
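The colored noise design values can be checked in the same way. The sketch below reads (46) as $W = \sqrt{125/12}\,\sqrt{K_1}K_2 D/\delta^3$, $N = (5/2)K_2/\delta^2$, $C^2 = 5R(0)K_2/\delta^2$, and uses $W e_W = C^2/2$ for the sinusoidal pulse in the colored noise bound; the function names and the sample constants are assumptions made for illustration.

```python
import math

def colored_noise_design(delta, D, K1, K2, R0):
    """Smallest (W, N, C^2) meeting mean-square error delta^2, per (46).

    N is returned as a real number; the text uses its integer part.
    """
    W = math.sqrt(125.0 / 12.0) * math.sqrt(K1) * K2 * D / delta ** 3
    N = 2.5 * K2 / delta ** 2
    C_sq = 5.0 * R0 * K2 / delta ** 2  # equals 2 * R(0) * N
    return W, N, C_sq

def colored_noise_bound(W, N, C_sq, D, K1, K2, R0):
    """Bias term plus variance and channel noise terms for Lip 1 signals."""
    bias = K1 * D ** 2 * (N / (math.sqrt(3.0) * W)) ** 2
    return bias + K2 * (1.0 / N + 2.0 * R0 / C_sq)

if __name__ == "__main__":
    W, N, C_sq = colored_noise_design(delta=0.1, D=1.0, K1=4.0, K2=1.0, R0=1.0)
    print(W, N, C_sq, colored_noise_bound(W, N, C_sq, 1.0, 4.0, 1.0, 1.0))
```

Note that $C^2$ here does not involve $D$ and scales as $1/\delta^2$, while $W$ still scales as $D/\delta^3$.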

Comparing (44) with (46) we note that the transmitter power $C^2$ in the white noise case is proportional to the variation parameter $D$ of the signal and to $1/\delta^3$, whereas in the colored noise case the transmitter power is independent of $D$ and proportional to $1/\delta^2$ only.


IV.  COMMENTS

We  point  out  some  open  problems  connected  with  the  reconstruction  scheme 
considered  in  this  paper. 

We first note that the results of this paper generalize and sharpen those of [4]. In particular, the mean-square convergence rate obtained here is $W^{-2/3}$ compared to $W^{-1/2}$ for the nonsequential estimates of [4]. An open problem is therefore to find the ultimate mean-square convergence rate of any recovery scheme (sequential or not) based on the binary data $\{Z_{W,k}\}_k$. We believe this rate to be $W^{-1}$ for nonconstant signals $s(t)$ (when $s(t)$ is constant for all $t$, the problem is trivial). One reason for this belief is the nature of the trade-off between bias and variance in Theorem 2.1 (as a function of the block size $N$), which is reminiscent of a similar trade-off in spectral and probability density estimation (as a function of the window-bandwidth parameter). This problem is currently under investigation.

A second open problem is to extend the results of this paper to the case where the signal $s(t)$ is not necessarily uniformly bounded. It is clear that in such a case the contaminating sequence $\{X_k\}$ should have a strictly monotonic distribution $F(x)$ over $(-\infty,\infty)$ (e.g., Gaussian). Then $\mu(s)$ is strictly monotonic on $(-\infty,\infty)$, $s(t) = \mu^{-1}[m(t)]$, and we set $\hat s(t) = \mu^{-1}[\hat m(t)]$. Now it is possible to show that $\hat m(t)$ converges to $m(t)$ with probability one and thus, also, $\hat s(t)$ to $s(t)$, since $\mu^{-1}(x)$ is a continuous function. The main problem in this case is to obtain bounds on the mean-square error for $\hat s(t)$; the difficulty is that such bounds cannot be obtained from those for $\hat m(t)$, as in the proof of Theorem 2.1, since $\mu^{-1}(x)$ is not Lip 1 on $[-1,1]$.

The  question  of  extending  the  results  of  this  paper  to  stochastic  signals 
is  currently  under  investigation. 


APPENDIX 

Collected  here  are  two  propositions  needed  in  the  proofs  of  the  theorems  in 
Section  II  as  well  as  a  supplement  to  the  white  channel  noise  case  of  Section  III. 

The first proposition provides the link between the properties of $\hat s_W(t)$ and $\hat m_W(t)$.

A. Proposition 1. Under Assumptions (A) and (B) we have: under (B1)

$$\hat s_W(t) - s(t) = b[\hat m_W(t) - m(t)]$$

and under (B2) or (B3) we have for each integer $p \ge 1$,

$$E|\hat s_W(t) - s(t)|^p \le A_p\, E|\hat m_W(t) - m(t)|^p$$

where

for  (82):  Ap  =  (tt/2)p/2  <jP  JqZ/Z°Z  [1  +  (b/e)p] 

for  (83):  Ap  =  a'p  e^0  [1  +  (b/e)p]. 

Proof. See Proposition 5.1 in [4]. $\square$

The second proposition provides upper bounds on the cumulants of the estimate $\hat m_W(t)$.

B. Proposition 2. For every integer $r \ge 2$ and every choice of instants $t_1,\ldots,t_r \ge 0$, the joint cumulant of order $r$ of the estimates $\hat m_W(t)$ (cf. (8a)-(8b)) satisfies

$$|\operatorname{Cum}_r\{\hat m_W(t_1),\ldots,\hat m_W(t_r)\}| \le \frac{\Gamma_r}{N^{r-1}}$$

uniformly in $\{t_j\}$ for some finite constant $\Gamma_r$.

Proof. Assume without loss of generality that $t_j \in [k_j/W, (k_j+1)/W)$, $j = 1,\ldots,r$, where the integers $k_1,\ldots,k_r$ are not necessarily distinct. Then we can write

$$\hat m_W(t_j) = \sum_{i=k_j}^{k_j+N} h_{W,k_j}(t_j,i)\, Z_{W,i},$$

where for the estimate (8a)

$$h_{W,k_j}(t_j,i) = \begin{cases} \dfrac{1}{N}, & i = k_j, k_j+1, \ldots, k_j+N-1, \\ 0, & \text{otherwise}, \end{cases} \tag{A1}$$

and for the estimate (8b) they are given by (17). By Proposition 4.2 of [4] we have

$$|\operatorname{Cum}_r\{\hat m_W(t_1),\ldots,\hat m_W(t_r)\}| \le \Gamma_r \sum_{i \in I}\ \prod_{j=1}^{r} h_{W,k_j}(t_j,i) \tag{A2}$$

where

$$I = \bigcap_{j=1}^{r} \{k_j, \ldots, k_j+N\}.$$

For the estimate (8a) we have by (A1)

$$\sum_{i \in I}\ \prod_{j=1}^{r} h_{W,k_j}(t_j,i) = \sum_{i \in I'} \frac{1}{N^r} \le \frac{1}{N^{r-1}},$$

where $I' = \bigcap_{j=1}^{r} \{k_j, \ldots, k_j+N-1\}$ and the inequality follows from the cardinality of $I'$ being at most $N$. For the estimate (8b) we have from (A2) and the $r$-dimensional version of Hölder's inequality

$$|\operatorname{Cum}_r\{\hat m_W(t_1),\ldots,\hat m_W(t_r)\}| \le \Gamma_r \prod_{j=1}^{r} \Big\{ \sum_{i \in I} [h_{W,k_j}(t_j,i)]^r \Big\}^{1/r}. \tag{A3}$$

Now by (17)

$$\sum_{i \in I} [h_{W,k_j}(t_j,i)]^r \le \sum_{i=k_j}^{k_j+N} [h_{W,k_j}(t_j,i)]^r = \frac{1}{N^r}\Big\{ [1 - (Wt_j - k_j)]^r + (N-1) + (Wt_j - k_j)^r \Big\} \le \frac{1}{N^{r-1}}, \tag{A4}$$

where the last step follows from $(1-x)^r + x^r \le 1$ for $0 \le x \le 1$. The result now follows by substituting (A4) in (A3). $\square$
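The elementary inequality behind (A4) is easy to confirm numerically. The sketch below evaluates the sum $N^{-r}\{(1-x)^r + (N-1) + x^r\}$ with $x = Wt_j - k_j \in [0,1]$ on a grid and compares it with $N^{-(r-1)}$; the function name and the grid are assumptions made for this example.

```python
def kernel_power_sum(x, N, r):
    """(1/N^r) * {(1-x)^r + (N-1) + x^r}: sum of r-th powers of the (8b) weights."""
    return ((1.0 - x) ** r + (N - 1) + x ** r) / N ** r

def a4_holds(N, r, grid=101, tol=1e-12):
    """Check kernel_power_sum(x, N, r) <= 1/N^(r-1) for x on a grid in [0, 1]."""
    bound = 1.0 / N ** (r - 1)
    return all(
        kernel_power_sum(i / (grid - 1), N, r) <= bound + tol for i in range(grid)
    )

if __name__ == "__main__":
    print(all(a4_holds(N, r) for N in (1, 2, 5, 50) for r in (2, 3, 6)))
```

Equality is attained at $x = 0$ and $x = 1$, where the end weights reduce to those of a plain block average.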

C. Supplement to Section III. Assume the channel noise $\{n(t), -\infty < t < \infty\}$ to be white and Gaussian with $R(t) = (\nu_0/2)\delta(t)$. Let

$$\hat Z_{W,k} = \operatorname{sgn}[T_k - \theta]$$

for some threshold $\theta$ which minimizes $E[\hat Z_{W,k} - Z_{W,k}]^2$. It is not difficult to see that

$$E[\hat Z_{W,k} - Z_{W,k}]^2 = 4P_e$$

where $P_e$ is the probability of error per symbol given by
Pe  *  */  — — [1  +.  m(k/W_)j  +  p  .  $1  9  J  f. 

vw2 >  \ W2' 


m(k/W) 


the value of $\theta$ which minimizes $P_e$ is

1  »_  1  -  m(k/W) 

9  tn  1  +  m(  k/W) 

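which is dependent on $m(k/W) = \mu[s(k/W)]$. Since $s(t)$ is unknown, a nonparametric choice of $\theta$ is $\theta = 0$, for which

$$\hat Z_{W,k} = \operatorname{sgn}[T_k] \tag{A5}$$

and

$$E[\hat Z_{W,k} - Z_{W,k}]^2 = 4[1 - \Phi(d_W)],$$

where

$$d_W = \sqrt{2e_W/\nu_0}.$$

We then have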


Proposition C. Under the assumptions of Theorem 3.1, but with $\hat Z_{W,k}$ given by (A5), we have when the channel noise is white and Gaussian

$$E[\hat s_W(t) - s(t)]^2 \le K_1\{\omega(s;\alpha_W(t)) + Q[1 - \Phi(d_W)]\}^2 + K_2\, v_W^2(t),$$

where $\alpha_W^2(t)$, $v_W^2(t)$, and the constants $K_1$ and $K_2$ are as in Theorem 3.1 and the constant $Q$ is given by (12).

Corollary C. Under the assumptions of Corollary 3.1, we have, uniformly in $t \ge 0$,

$$E[\hat s_W(t) - s(t)]^2 \le K_1\Big\{\omega\!\left(s;\frac{N}{\sqrt{3}\,W}\right) + Q[1 - \Phi(d_W)]\Big\}^2 + \frac{K_2}{N}.$$

Proof. The derivations proceed in the manner of the proof of Theorem 3.1, noting that for $\hat Z_{W,i}$ given by (A5), we now have

$$E[\hat Z_{W,i}] = [2\Phi(d_W) - 1]\, m(i/W),$$

$$|\operatorname{Cov}\{\hat Z_{W,i},\hat Z_{W,j}\}| \le \delta_{ij}.$$

Then

$$|\operatorname{Bias}[\hat m_W(t)]| \le 2[2\Phi(d_W) - 1]\,\omega(m;\alpha_W(t)) + 2[1 - \Phi(d_W)]\,|m(t)| \le 2\{\omega(m;\alpha_W(t)) + [1 - \Phi(d_W)]\}$$

and

$$\operatorname{Var}[\hat m_W(t)] \le \sum_{i=0}^{\infty} h_W^2(t,i) = v_W^2(t),$$

and the result follows. $\square$

Note that the channel noise here increases the bias of the estimate $\hat s_W(t)$, in contrast to Theorem 3.1 (a). Corollary C implies that the additional term, due to the channel noise, becomes negligible only when the "signal to channel-noise ratio" $d_W$ tends to infinity as $W \to \infty$. In sharp contrast, Corollary 3.1 (a) implies that the additional term $K_2/(d_W^2 N)$, due to the channel noise, becomes negligible as $W \to \infty$ (and thus $N \to \infty$) even if $d_W = 1$, say. Thus, with $d_W = 1$, the noise contribution when $\hat Z_{W,k}$ is nonoptimal (35) will be smaller, for large $W$, than when $\hat Z_{W,k}$ is optimal (A5). This conclusion holds even though with $d_W = 1$ we find for the nonoptimal estimate (35)
that (cf. (36), (37))

Fig. 1  The structure of the transmitter/receiver model - noiseless channel

Fig. 2  Modification of the transmitter/receiver in the presence of channel noise